Introduction


An important challenge in prostate cancer research is to develop effective predictors of tumor
recurrence following surgery in order to determine whether immediate adjuvant therapy is warranted.
To identify biomarkers predictive of biochemical recurrence, we isolated RNA from 70
formalin-fixed paraffin-embedded (FFPE) radical prostatectomy specimens with known long-term
outcome and performed DASL expression profiling with a custom-designed panel of 522 prostate
cancer relevant genes. We identified a panel of ten protein-coding genes and two miRNA genes
(RAD23B, FBP1, TNFRSF1A, CCNG2, LETMD1, NOTCH3, ETV1, BID, SIM2, ANXA1, miR-519d, and
miR-647) that could be used to separate patients with and without biochemical recurrence in the
full training set (p < 0.001) as well as in the subset of 42 Gleason score 7 patients (p < 0.001).
We performed an independent validation analysis on 40 samples and found that the biomarker panel
also significantly predicted biochemical recurrence for all cases (p = 0.013) and for a subset of
19 Gleason score 7 cases (p = 0.010), with both analyses adjusted for relevant clinical information
including T-stage, PSA and Gleason score. Importantly, these biomarkers could significantly predict
clinical recurrence for Gleason score 7 patients. A manuscript describing these biomarkers is now
in press at the American Journal of Pathology (1).

In  March  2010,  we  received  funding  for  a  two-year  Prostate  Cancer  IDEA  Award  to  validate 
and  improve  this  set  of  biomarkers  on  an  independent,  large  set  of  patient  samples.  Over  the  past 
year,  we  have  dealt  with  a  large  number  of  administrative  hurdles  to  obtain  the  necessary  approvals 
to  proceed  with  the  research  funded  by  this  award.  Now  that  all  of  these  hurdles  have  been 
overcome,  we  are  beginning  to  implement  our  research  plan  in  earnest.  Because  of  the  delays  in 
the  beginning  of  the  project,  we  are  requesting  a  no-cost  extension  of  the  project  from  a  2-year  to  a 
3-year  project. 

Body


Custom  Prostate  DASL  profiling 

Funding for this IDEA Award was based on our DASL expression profiling data using our
custom-designed prostate cancer panel and the Illumina DASL microRNA (miRNA) panel on 70
prostatectomy patient samples to identify biomarkers predictive of recurrence. In addition, an
independent validation profiling experiment was performed on 40 additional samples. MiRNA probes
were filtered to retain only those that were present on the miRNA microarrays used for both the
training and validation sets, reducing the total number of miRNA probes examined to 403.
The training set included 29 cases with observed biochemical PSA recurrence (median time to
recurrence = 19 months), and 41 censored cases, i.e., without observed recurrence during follow-up
(median follow-up time = 83.0 months). A summary of the clinical characteristics of the training and
validation sets of samples is provided in Table 1.

Integrated  DASL  biomarker  analysis 

After fitting a univariate Cox proportional hazards (PH) model for each individual probe using
the training data, a set of 27 important probes was preselected based on an FDR threshold of 0.30.
Next, to identify the optimal prediction score based on the preselected probes, we fit a lasso Cox
PH model (2,3), first using only the set of 25 preselected mRNA probes, which resulted
in a panel of nine protein-coding genes shown in Table 2 (RAD23B, FBP1, TNFRSF1A, NOTCH3,
ETV1, BID, SIM2, ANXA1, and BCL2). A final prediction model was then built to include the
predictive score based on this panel of nine mRNA biomarkers as well as the relevant clinical
biomarkers including T-stage, PSA and Gleason score, which could be used to predict recurrence
following radical prostatectomy. Kaplan-Meier analysis (Figure 1A) demonstrated that these probes
could significantly discriminate patients at higher and lower risk of recurrence by the log-rank
test (p < 0.001). We next applied the final predictive model developed on the training set to the
validation set, a separate, independent DASL profiling experiment performed on a different day.
Kaplan-Meier analysis (Figure 1B) on this validation set showed that the model could discriminate
patients at higher and lower risk of recurrence (p = 0.010).
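
The preselection step described above can be illustrated with a short sketch. The report states only that an FDR threshold of 0.30 was applied to the univariate Cox results; the choice of the Benjamini-Hochberg procedure, the function names, and the probe names and p-values in the example are assumptions made for illustration and are not the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def preselect(probe_p_values, fdr_threshold=0.30):
    """Keep probes whose adjusted p-value is at or below the FDR threshold."""
    names = list(probe_p_values)
    adjusted = benjamini_hochberg([probe_p_values[n] for n in names])
    return [n for n, q in zip(names, adjusted) if q <= fdr_threshold]


if __name__ == "__main__":
    # Hypothetical univariate Cox p-values for four made-up probes.
    example = {"probe_a": 0.01, "probe_b": 0.04, "probe_c": 0.03, "probe_d": 0.5}
    print(preselect(example))
```

A threshold as permissive as 0.30 is reasonable only because this step is a coarse filter; the lasso fit that follows performs the final selection.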

Subsequently, we repeated the above training procedure using the complete set of 27
preselected mRNA and miRNA probes, and we identified an optimal panel of ten mRNAs and two
microRNAs (Table 3) and built a final prediction model for prostate cancer biochemical recurrence,
which again included relevant clinical biomarkers. Kaplan-Meier analysis and the log-rank test
determined that this panel could also significantly discriminate patients at higher and lower risk of
recurrence both in the training set (p < 0.001, Figure 1C) and in the validation set (p = 0.013,
Figure 1D).
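
As an illustration of the Kaplan-Meier analysis referred to above, the following minimal sketch splits patients into higher- and lower-risk groups and estimates recurrence-free survival for one group. The median cutoff, the function names, and the toy data are assumptions; the report does not state how the risk groups were defined.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time in months for each patient
    events -- 1 if recurrence was observed, 0 if the patient was censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        recurred = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - recurred / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Toy follow-up data (months) with two censored patients.
    for t, s in kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve, which is how the 41 censored training cases enter the analysis.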

Prediction  of  Cases  with  a  Gleason  Score  7 

Prediction of recurrence for patients with a Gleason score 7 is particularly difficult. In order to
address this issue, we applied the biomarker panels to the subset of cases in the training and
validation sets that had a Gleason score 7. The prediction model based on the nine-mRNA panel
significantly discriminated biochemical recurrence in Gleason score 7 cases in both the
training set (p < 0.001, Figure 2A) and the validation set (p = 0.027, Figure 2B). For the prediction
model based on the combined panel of ten mRNAs and two miRNAs in Table 3, the predictive value
was again significant for both the training set (p < 0.001, Figure 2C) and the validation set (p =
0.010, Figure 2D). A summary of the p-values for predicting biochemical recurrence is given in
Table 4. In all cases, the prediction models that use one of the two gene biomarker panels plus
clinical information outperform the prediction model using only clinical information.
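
The p-values summarized in Table 4 come from the log-rank test. A minimal two-group version is sketched below as an illustration only: the function name and the toy data are assumptions, and the sketch does not include the adjustment for clinical covariates described in the report.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df.

    times_*  -- follow-up times for each group
    events_* -- 1 if recurrence was observed, 0 if censored
    """
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


if __name__ == "__main__":
    early = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
    late = ([10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
    print(logrank_test(*early, *late))
```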

Analysis  of  clinical  recurrence 

Although most patients who have clinical recurrence following prostatectomy also have
biochemical recurrence, there is a significant population of patients with biochemical recurrence who
do not have clinically significant recurrences observed during follow-up. To evaluate whether our
biochemical recurrence biomarker panel also predicts clinical recurrence, we tested the
prediction model based on the combined mRNA/miRNA panel in the same training and validation
samples using their clinical recurrence outcome data. Unfortunately, clinical recurrence data were
lacking for some of the samples, and the total number of samples used in the training set was
reduced. In the training data, the combined mRNA/miRNA panel was highly significant for predicting
clinical recurrence in all patients (p = 0.002) as well as in the subset of patients with a Gleason score
7 (p = 0.004); in the validation data, it was also significant for predicting recurrence in patients with a
Gleason score 7 (p = 0.023) and trended towards significance in all patients (p = 0.078). A summary of
the p-values for predicting clinical recurrence is given in Table 5. In all cases, the prediction model
that uses the combined mRNA and miRNA panel plus the clinical information again outperforms the
prediction model that uses only the clinical information.

We also performed an analysis to construct a predictive set of biomarkers based on the
clinical recurrence data instead of biochemical recurrence. Only three probes passed the initial
preselection step for the univariate Cox PH modeling, all corresponding to the ETV1 gene, which is
likely due to the considerably smaller number of clinical recurrences in the training set as well as the
smaller total sample size. Furthermore, the prediction model built on this set of gene biomarkers did
not perform as well as the models built on biochemical recurrence (data not shown).

American  Journal  of  Pathology  Manuscript 

A manuscript describing these results was submitted to the American Journal of Pathology in
September, 2010, a revision was submitted in December, 2010, and the paper was accepted for
publication in March, 2011. This manuscript is attached as Appendix 1.

TABLES


                              Training      Training      Training      Validation    Validation    Validation
                              Set (Total)   Set (No BCR)  Set (BCR)     Set (Total)   Set (No BCR)  Set (BCR)
Number Cases                  70            41            29            40            27            13
Clinical Recurrence           8             0             8             11            0             11
No Clinical Recurrence        57            41            16            29            27            2
Median Time F/U (months)      84            83            81            74            75            73
Median Time to BCR (months)   19            N/A           19            14            N/A           14
Median Time no BCR (months)   48            83            19            34.5          56            14
Gleason Score (Avg +/- SD)    6.9 +/- 0.6   6.7 +/- 0.6   7.0 +/- 0.6   7.0 +/- 0.8   6.8 +/- 0.7   7.4 +/- 1
PSA (Avg +/- SD)              9.2 +/- 5.4   8.7 +/- 6.4   9.9 +/- 3.8   12.7 +/- 8.4  12.4 +/- 9.9  13.1 +/- 5.3
Age (Avg +/- SD)              61.9 +/- 7.7  61.2 +/- 1.1  62.9 +/- 7.8  63.6 +/- 8.4  63.5 +/- 8.3  64 +/- 8.9

Table  1:  A  summary  of  the  clinical  characteristics  of  the  training  and  validation  sets  of  patient 
samples.  (BCR  =  Biochemical  Recurrence,  F/U  =  follow  up,  PSA  =  prostate  specific  antigen,  SD  = 
standard  deviation). 


Symbol      Description                                              Coefficient   References
RAD23B      RAD23 homolog B                                           0.152        4, 5
FBP1        Fructose-1,6-bisphosphatase 1                             0.310        6-8
TNFRSF1A    Tumor Necrosis Factor Receptor Superfamily, Member 1A    -0.560        9, 10
NOTCH3      Notch homolog 3                                           0.426        11, 12
ETV1        Ets Variant Gene 1 (ETV1)                                 0.157        13, 14
BID         BH3 Interacting Domain Death Agonist (BID)                0.248        15, 16
SIM2        Single-Minded Homolog 2                                   0.043        17-20
ANXA1       Annexin A1                                               -0.185        21-24
BCL2        B-cell CLL/lymphoma 2                                     0.028        25, 26

Table 2: Nine-gene predictor of prostate cancer recurrence following surgery. Coefficient is
from the lasso Cox proportional hazards model and was used for computing the predictive score.
Positive coefficients indicate a positive association with recurrence, and negative coefficients a
negative association with recurrence.
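
To make the role of the Table 2 coefficients concrete, the sketch below computes a predictive score as the linear predictor of a Cox model, i.e. the sum of coefficient times expression value over the nine genes. This is an assumption about the form of the score: the report does not give the exact formula, the normalization of the expression values, or how the clinical covariates enter the final model. The function name and the example expression values are hypothetical.

```python
# Lasso Cox coefficients copied from Table 2.
TABLE2_COEFFICIENTS = {
    "RAD23B": 0.152,
    "FBP1": 0.310,
    "TNFRSF1A": -0.560,
    "NOTCH3": 0.426,
    "ETV1": 0.157,
    "BID": 0.248,
    "SIM2": 0.043,
    "ANXA1": -0.185,
    "BCL2": 0.028,
}


def predictive_score(expression, coefficients=TABLE2_COEFFICIENTS):
    """Sum of coefficient * expression over the genes in the panel.

    expression -- dict mapping gene symbol to a (hypothetical) normalized
                  expression value for one patient
    """
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


if __name__ == "__main__":
    # Made-up patient with every gene at one unit of normalized expression.
    patient = {gene: 1.0 for gene in TABLE2_COEFFICIENTS}
    print(round(predictive_score(patient), 3))
```

Under this form, higher expression of a gene with a positive coefficient (for example NOTCH3) raises the score, while higher expression of TNFRSF1A or ANXA1 lowers it.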

Symbol          Description                                              Coefficient   References
RAD23B          RAD23 homolog B                                           0.070        4, 5
FBP1            Fructose-1,6-bisphosphatase 1                             0.251        6-8
TNFRSF1A        Tumor necrosis factor receptor superfamily, member 1A    -0.588        9, 10
CCNG2           Cyclin G2                                                 0.008        27-29
hsa-miR-647     hsa-miR-647                                              -0.318
LETMD1          LETM1 domain containing 1                                 0.063        30-33
NOTCH3          Notch homolog 3                                           0.367        11, 12
ETV1            ETS variant gene 1 (ETV1)                                 0.179        13, 14
hsa-miR-519d    hsa-miR-519d                                              0.551        34
BID             BH3 interacting domain death agonist (BID)                0.128        15, 16
SIM2            Single-minded homolog 2                                   0.124        17-20
ANXA1           Annexin A1                                               -0.143        21-24

Table 3: Twelve-gene predictor of prostate cancer recurrence following surgery using ten mRNAs
and two microRNAs. Coefficient is derived from the lasso Cox proportional hazards model and was
used for computing the predictive score. Positive coefficients indicate a positive association with
recurrence, and negative coefficients a negative association with recurrence.

Training Set              mRNA panel    Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=61)          <0.001        <0.001                       0.096
Gleason score 7 (n=42)    <0.001        <0.001                       0.641

Validation Set            mRNA panel    Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=35)          0.010         0.013                        0.020
Gleason score 7 (n=19)    0.027         0.010                        0.028

Table 4: Summary of p-values (log-rank test) of prediction of biochemical recurrence on training
and validation sets for the entire dataset and the subset of Gleason score 7 cases using two biomarker
panels, all of which are adjusted for T-stage, PSA, and Gleason score, or using clinical information
only. Significant p-values are indicated in bold.


Training Set              Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=56)          0.002                        0.262
Gleason score 7 (n=37)    0.004                        0.136

Validation Set            Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=35)          0.078                        0.193
Gleason score 7 (n=19)    0.023                        0.080

Table 5: Summary of p-values (log-rank test) of prediction of clinical recurrence on training and
validation sets for the entire dataset and the subset of Gleason score 7 cases using the combined
mRNA/miRNA panel, all of which are adjusted for T-stage, PSA, and Gleason score, or using clinical
information only. Significant p-values are indicated in bold.

FIGURES


[Figure 1 panels: (A) Training data set (n = 61, p < 0.001); (B) Validation data set (n = 35, p = 0.010);
(C) Training data set (n = 61, p < 0.001); (D) Validation data set (n = 35, p = 0.013).
X-axis: Time to recurrence (months).]

Figure  1:  Prediction  of  biochemical  recurrence  in  all  prostate  cancer  patients  using  two 
biomarker  panels,  adjusted  for  clinical  information.  (A)  Kaplan-Meier  analysis  of  the  training  set 
patients  that  were  separated  based  on  the  mRNA  panel  described  in  Table  2.  (B)  Kaplan-Meier 
analysis  on  the  validation  cases  using  the  mRNA  panel.  (C)  Kaplan-Meier  analysis  of  the  training  set 
using  the  combined  mRNA  and  miRNA  panel  described  in  Table  3.  (D)  Kaplan-Meier  analysis  of  the 
validation  set  using  the  combined  mRNA  and  miRNA  panel. 

[Figure 2 panels: (A) Training data set (n = 42, p < 0.001); (B) Validation data set (n = 19, p = 0.027);
(C) Training data set (n = 42, p < 0.001); (D) Validation data set (n = 19, p = 0.010).
X-axis: Time to recurrence (months).]

Figure 2: Prediction of biochemical recurrence in prostate cancer patients with a Gleason score
7  using  two  biomarker  panels,  adjusted  for  clinical  information.  (A)  Kaplan-Meier  analysis  of  the 
training  set  of  Gleason  score  7  cases  using  the  mRNA  panel  described  in  Table  2.  (B)  Kaplan-Meier 
analysis  of  the  Gleason  score  7  cases  in  the  validation  set  using  the  mRNA  panel.  (C)  Kaplan-Meier 
analysis  of  the  Gleason  score  7  cases  in  the  training  set  using  the  combined  mRNA  and  miRNA  panel 
described  in  Table  3.  (D)  Kaplan-Meier  analysis  of  the  Gleason  score  7  cases  in  the  validation  set 
using  the  combined  mRNA  and  miRNA  panel. 

Administrative hurdles

When this project was initiated, we anticipated that the IRB protocols already in place would
be sufficient for conducting the described research project. However, review by the DOD
determined that the IRB protocols were not sufficiently specific. Consequently, we
submitted new, specific IRB protocols at Emory University, the Emory/Atlanta VA Medical Center,
and at Sunnybrook Research Centre at the University of Toronto. While we obtained IRB approval
letters fairly quickly at Emory and the VA, approval from Sunnybrook took much longer. The
difficulty hinged on award of the subcontract, since the Sunnybrook IRB would not issue approval
without funding, and Emory would not issue the subcontract without the IRB approval. We were
stuck in a Catch-22 for several months before we finally obtained Sunnybrook IRB approval in
November, 2010 and permission to begin work at Sunnybrook in December, 2010.
Unfortunately, it took another two months to resolve all of the legal issues, including
the fact that there is no HIPAA law in Canada, before the subcontract could be awarded to
Sunnybrook by Emory University. Now that all of the administrative hurdles have been overcome, we
expect to begin receiving samples from Toronto in the very near future.

Progress  at  Emory  University 

In the meantime, once approval was obtained from the DOD on August 13, 2010 to
commence work on the project at Emory, we immediately began identifying samples that
we could use in our validation study. We identified 150 cases at the VA hospital from 1990 to 2000
that could potentially be used for this project. We were able to locate slides and formalin-fixed
paraffin-embedded (FFPE) blocks for 100 of those cases, and Dr. Oyesiku identified regions of
cancer and benign tissue in slides for each of them. These samples were then submitted for
processing to obtain 1 mm tissue cores. We anticipate that we will have RNA ready for WG-DASL
analysis in the next few weeks. We are planning to examine additional cases at Emory and the VA
from 2000 to 2003 to identify samples for use in Aim 2.

Platform  issues 

In our initial proposal, we planned to use the Illumina miRNA DASL platform for analysis of
miRNA biomarker expression levels. Since submission of our application for initial review, Illumina
has discontinued this platform. Consequently, we are evaluating several different options for
comprehensive analysis of miRNA levels. These options include TaqMan Low Density Arrays,
Affymetrix miRNA arrays, High-Throughput Genomics Quantitative Nuclease Protection Assays,
Illumina HiSeq sequencing, Ion Torrent sequencing, and Nanostring sequencing. There are
strengths and weaknesses to each of these options that we are currently scrutinizing. However, we
are likely to use a 48-plex multiplexing method of next generation sequencing on the Illumina HiSeq
platform. This should give us deep coverage and high data quality at an acceptable cost. We plan
to perform pilot studies shortly to determine whether we can obtain acceptable data using
FFPE-derived RNA.

Key  Research  Accomplishments


Our  key  research  accomplishments  are  summarized  below: 

•  Identified  a  set  of  10  mRNAs  and  2  miRNAs  predictive  of  recurrence  following  prostatectomy. 

•  Published  a  manuscript  describing  these  biomarkers  of  recurrence. 

•  Obtained  all  necessary  IRB  and  DOD  approvals  to  commence  work. 

•  Identified  100  prostate  cancer  cases  at  the  Atlanta  VA  for  the  validation  study.

•  Marked  tumor  and  benign  areas  to  enable  coring  of  FFPE  blocks. 

•  Initiated  RNA  extraction  of  the  first  100  cases  for  WG-DASL  analysis. 

Reportable Outcomes


March 25, 2010        Grant Award Received
May 20, 2010          New IRB Protocol Requested from DOD
June 20, 2010         New IRB Protocol Approved by Emory University
August 10, 2010       Approval received from VA Research Committee to use VA samples
August 13, 2010       Permission to begin research at Emory obtained from DOD
September 24, 2010    Manuscript describing initial biomarkers submitted for publication
October 12, 2010      Initial set of 100 samples identified at the Emory/VA
November 5, 2010      Abstract submitted to American Society for Investigative Pathology (ASIP)
                      2011 meeting in Washington, DC
November 15, 2010     New IRB Protocol Approved by U. Toronto/Sunnybrook
December 21, 2010     Permission to begin research at U. Toronto/Sunnybrook obtained from DOD
December 31, 2010     Revised manuscript describing biomarkers submitted for publication
February 7, 2011      Abstract for ASIP 2011 meeting selected for oral presentation
February 18, 2011     Tissue blocks pulled and slides marked for 100 samples at the VA
February 28, 2011     Subaward Contract agreed between Emory and U. Toronto/Sunnybrook
March 3, 2011         Manuscript describing initial biomarkers accepted for publication in AJP (1)
March 15, 2011        Tissue coring from 100 VA FFPE tissue blocks initiated



Conclusion 


Thus far we have made good progress on our goal to validate biomarkers of recurrence in
prostate cancer. We have a manuscript now in press at The American Journal of
Pathology describing our set of biomarkers (1), and will be presenting these data orally at the
American Society for Investigative Pathology annual meeting in Washington, DC on April 11, 2011.
We have initiated collection of samples at the Atlanta VA Medical Center and begun tissue coring
and RNA extraction. We faced some administrative hurdles in getting the project started,
specifically with IRB approvals and subcontract awards. However, now that all of these obstacles
have been overcome, we expect rapid progress in data generation over the next year. In addition,
the miRNA DASL platform has been discontinued by Illumina, and we are thus planning to transition
the project to next generation sequencing methods, which are likely to be more accurate and
comprehensive. Because of the delays in getting the project initiated, we are requesting a no-cost
one-year extension to change the project from a two-year to a three-year project.

Appendices


See attached manuscript by Long et al (1).

ABSTRACT


An important challenge in prostate cancer research is to develop effective predictors of tumor
recurrence following surgery in order to determine whether immediate adjuvant therapy is warranted.
To identify biomarkers predictive of biochemical recurrence, we isolated RNA from 70
formalin-fixed paraffin-embedded (FFPE) radical prostatectomy specimens with known long-term
outcome and performed DASL expression profiling with a custom-designed panel of 522 prostate
cancer relevant genes. We identified a panel of ten protein-coding genes and two miRNA genes
(RAD23B, FBP1, TNFRSF1A, CCNG2, LETMD1, NOTCH3, ETV1, BID, SIM2, ANXA1, miR-519d, and
miR-647) that could be used to separate patients with and without biochemical recurrence in the
full training set (p < 0.001) as well as in the subset of 42 Gleason score 7 patients (p < 0.001).
We performed an independent validation analysis on 40 samples and found that the biomarker panel
also significantly predicted recurrence for all cases (p = 0.013) and for a subset of 19 Gleason
score 7 cases (p = 0.010), with both analyses adjusted for relevant clinical information including
T-stage, PSA and Gleason score. Importantly, these biomarkers could significantly predict clinical
recurrence for Gleason score 7 patients. These biomarkers may increase the accuracy of
prognostication following radical prostatectomy using formalin-fixed specimens.

INTRODUCTION


Prostate cancer remains the most common non-cutaneous cancer diagnosed in U.S. males, and
ranks second in tumor site-specific mortality, with estimates for 2009 at over 192,000 new cases
and 27,000 deaths (1). The majority of patients with prostate cancer are clinically asymptomatic with
early-stage, organ-confined disease, and in fact, more than 50% of men who reach the age of 80
develop clinically insignificant prostate cancer. However, a subpopulation of prostate cancer patients
progresses to highly aggressive, androgen-independent metastatic disease, which is inevitably fatal.
One of the important challenges in current prostate cancer research is to develop effective methods to
determine whether a patient is likely to progress to aggressive, metastatic disease in order to aid
clinicians in deciding on the appropriate course of treatment. Biomarker assays that could predict
progression and metastasis for prostate cancer patients would be of great utility in aiding clinical
management of this large patient population. An important challenge in prostate cancer research is to
develop effective predictors of tumor recurrence following surgery in order to determine whether
immediate adjuvant therapy is warranted. Thus, biomarkers that could predict the likelihood of
success for surgical therapies would be of great clinical significance.

In the past few years, enormous progress has been made in developing technologies to exploit
formalin-fixed paraffin-embedded (FFPE) tumor tissue samples for gene expression analysis. The
DASL (cDNA-mediated Annealing, Selection, extension and Ligation) assay is a unique expression
profiling platform that is based upon massively multiplexed RT-PCR applied in a microarray format,
that allows for the determination of expression of RNA isolated from 96 FFPE tumor tissue samples in
a high throughput format (2,3).

Here,  we  have  identified  biomarkers  predictive  of  recurrence  by  expression  profiling  archived 
FFPE  tumor  samples  using  both  a  custom  panel  of  prostate  cancer  associated  mRNA  genes  and  a  panel 
of microRNA genes. These biomarkers were developed on a training set of 70 patients (29 with
biochemical recurrence and 41 controls) and validated on an independent set of 40 samples (13 with
biochemical recurrence and 27 controls) and were able to significantly discriminate between patients
with and without biochemical recurrence following radical prostatectomy. Moreover, these
biomarkers were able to discriminate biochemical recurrence in patients with Gleason score 7, for
whom  outcome  is  particularly  difficult  to  predict. 

MATERIALS  AND  METHODS


Patient  Samples 

In the initial training set, 70 cases were used (29 with biochemical recurrence and 41 controls):
45 patients from Sunnybrook Health Science Center (Toronto, ON) and 25 patients from Emory
University. The 45 paraffin-embedded tissue samples from Toronto were drawn from men
who underwent radical prostatectomy as the sole treatment for clinically localized prostate cancer
(PCa) between 1998 and 2006. The clinical data include multiple clinicopathologic variables such as
prostate specific antigen (PSA) levels, histologic grade (Gleason score), tumor stage (pathologic stage
category, for example: organ confined, pT2; with extra-prostatic extension, pT3a; or with seminal
vesicle invasion, pT3b), and biochemical recurrence rates. For the cases from Emory University, both
the training set (25 cases) and validation set (40 cases) FFPE samples were selected from a screen
of over a thousand patients through an IRB-approved retrospective study at Emory University of men
who had undergone radical prostatectomy between 1990 and 1994. Those who were included met
specific inclusion criteria, had available tissue specimens, had documented long-term follow-up, and
consented to participate or were included by IRB waiver. The cases were assigned prostate ID numbers
to protect their identities. These patients did not receive neo-adjuvant or concomitant hormonal
therapy. Their demographic, treatment and long-term clinical outcome data have been collected and
recorded in an electronic database. Clinical data recorded include PSA measurements, radiological
studies and findings, clinical findings, tissue biopsies and additional therapies that the subjects may
have received. Clinical data associated with the samples used in this study are given in
Supplemental Table S1 (available at http://ajp.amjpathol.org).

RNA  Preparation 

Tissue cores (1 mm) were used for RNA preparation rather than sections because of the
heterogeneity of samples and the opportunity for obtaining cores with very high percentage tumor
content. H&E stained slides were reviewed by a board certified urologic pathologist (AOO) to identify
regions of cancer and to select corresponding areas for cutting of cores from paraffin blocks. Total RNA
was prepared at the Emory Biomarker Service Center from FFPE cores as previously described (4), using
the Ambion Recoverall MagMax methodology in 96-well format on a MagMax 96 Liquid Handler
Robot (Life Technologies, Carlsbad, CA). FFPE RNA was quantitated using a Nanodrop
spectrophotometer (Wilmington, DE), and tested for RNA integrity and quality by Taqman analysis of
the RPL13a ribosomal protein on a HT7900 real-time PCR instrument (Applied Biosystems, Foster
City, CA). Samples with sufficient yield (>500 ng), A260/A280 ratio > 1.8 and RPL13a Ct values less
than 30 cycles were used for miRNA and DASL profiling.
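
The three acceptance criteria above can be expressed as a simple filter. The sketch below is illustrative only: the function name, argument names, and sample values are hypothetical, and only the three thresholds are taken from the text.

```python
def passes_qc(yield_ng, a260_a280, rpl13a_ct):
    """Return True if a sample meets all three criteria stated in the text.

    yield_ng  -- total RNA yield in nanograms (must exceed 500)
    a260_a280 -- absorbance ratio (must exceed 1.8)
    rpl13a_ct -- RPL13a Taqman Ct value (must be below 30 cycles)
    """
    return yield_ng > 500 and a260_a280 > 1.8 and rpl13a_ct < 30


if __name__ == "__main__":
    # Hypothetical samples: (identifier, yield in ng, A260/A280, RPL13a Ct).
    samples = [
        ("sample_01", 820.0, 1.95, 26.4),
        ("sample_02", 430.0, 1.90, 27.1),
        ("sample_03", 910.0, 1.85, 31.2),
    ]
    kept = [name for name, y, ratio, ct in samples if passes_qc(y, ratio, ct)]
    print(kept)
```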

Custom  Prostate  Cancer  DASL  Assay  Pool  (DAP) 

The DASL assay enables quantitation of expression of up to 1,536 probes using RNA isolated
from archived FFPE tumor tissue samples in a high-throughput format 2,3. Data from multiple
publicly available gene expression datasets 5-8, along with genes involved in prostate cancer
progression based on current understanding of the disease 6,9, were distilled to develop a highly
predictive set of 522 genes for use in the DASL assay. Due to specific probe design considerations,
this panel had three probes for 497 genes, two probes for 20 genes, and a single probe for five genes,
two of which were specific to TMPRSS2-ERG and TMPRSS2-ETV1 fusion transcripts. The unique
combination of genes was optimized for performance in the DASL assay using stringent criteria that
predict excellent performance of the primer sets. The panel includes genes found to be correlated
with Gleason score in Liu et al, Bibikova et al, True et al, LaPointe et al, and/or Singh et al.
It also includes prognostic markers from Dhanasekaran et al 5 and Yu et al 14, and genes associated
with metastasis in Varambally et al 6. In addition, a number of genes known from other studies to be
critical in prostate cancer, such as NKX3.1, PTEN, and the androgen receptor, are included in the
panel. Other genes that play important roles in the Wnt, Hedgehog, TGFβ, Notch, MAPK and PI3K
pathways are also present in this gene set. Finally, primer sets that detect chromosomal translocations
in ERG 9, ETV1 15, and ETV4 16 are also included in this panel. The custom prostate cancer panel list
of 522 candidate genes (see Supplemental Table S2 at http://ajp.amjpathol.org) was submitted to
Illumina for synthesis. The optimal oligonucleotide sequence for each of the 1,536 gene probes was
determined using an oligonucleotide scoring algorithm. The oligonucleotide pool or DASL Assay
Pool (DAP) was synthesized by Illumina for use with the 96-well Universal Array Matrix (UAM).
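
As a quick consistency check on the panel design described above, the probe breakdown can be tallied to confirm that it accounts for all 522 genes and all 1,536 probes. This is an illustrative sketch only; the variable names are hypothetical and the counts are copied from the text.

```python
# Number of genes carrying a given number of probes, as stated in the text.
genes_by_probe_count = {3: 497, 2: 20, 1: 5}

total_genes = sum(genes_by_probe_count.values())
total_probes = sum(n_probes * n_genes
                   for n_probes, n_genes in genes_by_probe_count.items())

print(total_genes, total_probes)
```

The totals match the 522-gene panel and the 1,536-probe capacity of the assay, so the custom pool fills the array format exactly.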

The DASL (cDNA-mediated Annealing, Selection, extension and Ligation) assay

The DASL assay was performed with our 522-gene custom-designed human prostate cancer
panel using 200 ng of input RNA at the Emory Biomarker Service Center, Emory University, according
to the manufacturer's protocols. Samples, including technical replicates (2, 3 or 4), were hybridized on
UAMs,  and  scanned  using  the  BeadStation  500  Instrument  (Illumina  Inc.).  For  miRNA  DASL  assays, 
the  human  miRNA  v2  DASL  panel  (Illumina,  Inc.),  which  allows  for  the  determination  of  expression 
of  1,146  human  miRNAs  (>  97%  coverage  of  miRBase  release  12)  was  used.  These  data  are  available 
at  GEO  under  accession  number  GSE26367. 

Data  Analysis 

DASL fluorescent intensities were interpreted in GenomeStudio, quantile normalized, and
exported for meta-analysis. Average signal intensity, genes detected (p-value = 0.01), background, and
noise (standard deviation of background) were analyzed for trends by plate, row, and column. The two
endpoints of interest were postoperative biochemical recurrence, defined as two detectable PSA
readings (>0.2 ng/ml), and clinical recurrence, defined as evidence of local or metastatic disease. The
primary outcome of interest was time to biochemical recurrence following surgery. A local recurrence
was defined as recurrence of cancer in the prostatic bed that was detected by a palpable nodule
on digital rectal examination (DRE) subsequently verified by a positive biopsy, and/or by a positive
imaging study (Prostascint or CT scan) accompanied by a detectable postoperative PSA result and lack
of evidence for metastases. Also, patients whose PSA level decreased following adjuvant pelvic
radiation therapy for elevated postoperative PSA were considered local recurrence cases. A
recurrence with metastases was defined as a positive imaging study indicating presence of a tumor
outside of the prostatic bed.
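
The biochemical recurrence endpoint defined above can be written as a small rule. This is a hedged sketch under stated assumptions: the function name and arguments are hypothetical, and the text does not say whether the two detectable readings must be consecutive, so the sketch simply counts readings above 0.2 ng/ml.

```python
def has_biochemical_recurrence(psa_readings_ng_per_ml, threshold=0.2, required=2):
    """Return True if at least `required` postoperative PSA readings exceed `threshold`.

    Assumption: readings need not be consecutive (the text does not specify).
    """
    detectable = [value for value in psa_readings_ng_per_ml if value > threshold]
    return len(detectable) >= required


# Made-up postoperative PSA series, for illustration only.
print(has_biochemical_recurrence([0.05, 0.30, 0.10, 0.40]))  # two readings above 0.2
print(has_biochemical_recurrence([0.05, 0.30, 0.10]))        # only one reading above 0.2
```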

To identify important biomarkers and build and evaluate prediction models for prostate cancer
recurrence, we adopted the following strategy. In the training step, the prediction model was built
based on the time to biochemical recurrence. Specifically, we first fit a univariate Cox proportional
hazard (PH) model for each individual oligonucleotide probe using the training data set, and a set of
important mRNA and miRNA probes was then preselected based on a false discovery rate (FDR)
threshold of 0.30. Next, to identify the optimal prediction score based on the preselected probes, we fit
a lasso Cox PH model 17,18 using the training data set, where the tuning parameter for lasso was
selected using a leave-one-out cross-validation technique 18. The lasso Cox PH model was fitted first
using the set of preselected mRNA probes only and then using the complete set of preselected mRNA
and miRNA probes, resulting in an optimal mRNA panel and an optimal combined mRNA/miRNA
panel, respectively. Based on each biomarker panel, a final prediction model for recurrence was built
to also incorporate relevant clinical biomarkers, namely, T-stage, PSA and Gleason score, through
fitting Cox PH models. For comparison, we also built a prediction model using only clinical
information, namely, T-stage, PSA and Gleason score, through fitting a Cox PH model.
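
The preselection step described above can be illustrated with a short sketch. The text gives the FDR threshold (0.30) but does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names, probe names, and p-values. The per-probe p-values would come from the univariate Cox PH fits, which are not reproduced here.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed procedure, not confirmed by the text)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def preselect(pvalue_by_probe, fdr_threshold=0.30):
    """Keep probes whose adjusted p-value is at or below the FDR threshold."""
    names = list(pvalue_by_probe)
    adjusted = bh_adjust([pvalue_by_probe[name] for name in names])
    return [name for name, q in zip(names, adjusted) if q <= fdr_threshold]


# Hypothetical univariate Cox p-values for four made-up probes.
toy_pvalues = {"probe_a": 0.01, "probe_b": 0.04, "probe_c": 0.20, "probe_d": 0.90}
selected = preselect(toy_pvalues)
print(selected)
```

A permissive threshold such as 0.30 is a screening choice: it keeps more candidate probes for the subsequent lasso fit, which then performs the final selection.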

To evaluate and validate the final prediction models obtained from the training phase, 79
samples from 40 patients were used, and replicate samples from the same patient were again averaged
to generate a single average signal for each patient. Each prediction model from the training phase was
used to generate a predictive score for each subject in the validation data set, and subjects were
subsequently divided into high and low scoring groups using the median predictive score. Kaplan-Meier
analysis was performed to compare the time to biochemical recurrence between high (poor
score) and low (good score) risk groups, and the statistical significance was determined using the
log-rank test. Similarly, we also evaluated the final model that uses the combined mRNA/miRNA
panel for predicting time to clinical recurrence in both training and validation data sets.
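
The scoring and grouping steps described above can be sketched as follows. The coefficients are copied from Table 3; everything else is an assumption made for illustration: the helper names, the toy expression values, the treatment of the score as a plain coefficient-weighted sum of averaged signals, and the rule that a score exactly at the median falls in the low group. The clinical covariates (T-stage, PSA, Gleason score) that enter the final model are omitted because their coefficients are not reported.

```python
from statistics import median

# Lasso Cox coefficients for the combined mRNA/miRNA panel, copied from Table 3.
COEFFICIENTS = {
    "RAD23B": 0.070, "FBP1": 0.251, "TNFRSF1A": -0.588, "CCNG2": 0.008,
    "hsa-miR-647": -0.318, "LETMD1": 0.063, "NOTCH3": 0.367, "ETV1": 0.179,
    "hsa-miR-519d": 0.551, "BID": 0.128, "SIM2": 0.124, "ANXA1": -0.143,
}


def average_replicates(replicates):
    """Average replicate profiles (a list of dicts) from one patient into one profile."""
    return {gene: sum(r[gene] for r in replicates) / len(replicates)
            for gene in replicates[0]}


def panel_score(profile):
    """Coefficient-weighted sum of the panel signals (clinical covariates omitted)."""
    return sum(coef * profile[gene] for gene, coef in COEFFICIENTS.items())


def median_split(scores):
    """Label each patient high or low risk relative to the median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Toy data: all signals zero except one gene per patient (made-up values).
base = {gene: 0.0 for gene in COEFFICIENTS}
replicates_by_patient = {
    "p1": [{**base, "NOTCH3": 1.0}, {**base, "NOTCH3": 3.0}],
    "p2": [{**base, "TNFRSF1A": 1.0}],
    "p3": [{**base, "FBP1": 1.0}],
    "p4": [{**base, "ANXA1": 1.0}],
}
scores = {pid: panel_score(average_replicates(reps))
          for pid, reps in replicates_by_patient.items()}
groups = median_split(scores)
print(groups)
```

The resulting high and low groups are what would then be compared by Kaplan-Meier analysis and the log-rank test, which are not implemented in this sketch.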

Missing data are present in this study, in particular, for clinical recurrence, PSA and T-stage
data. We adopted the available-case approach 19 in our analyses, and the sample sizes used in each step
of building and evaluating prediction models may be less than the total sample size.

RESULTS


Custom  Prostate  DASL  profiling 

We  performed  DASL  expression  profiling  with  our  custom-designed  prostate  cancer  panel  (see 
Materials and Methods section) and the Illumina DASL microRNA (miRNA) panel on 70
prostatectomy  patient  samples  to  identify  biomarkers  predictive  of  recurrence.  An  independent 
validation  profiling  experiment  was  performed  on  40  additional  samples.  MiRNA  probes  were  filtered 
to  retain  only  those  that  were  present  on  the  miRNA  microarrays  used  for  both  the  training  and 
validation  sets,  reducing  the  total  number  of  probes  examined  to  403  miRNA  probes.  The  training  set 
included  29  cases  with  observed  biochemical  PSA  recurrence  (median  time  to  recurrence  =  19 
months),  and  41  cases  censored,  i.e.,  without  observed  recurrence  during  follow-up  (median  follow-up 
time  =  83.0  months).  A  summary  of  the  clinical  characteristics  of  the  training  and  validation  sets  of 
samples is provided in Table 1. The complete dataset for the combined mRNA and miRNA data is
provided in Supplementary Table S3 for the training set and Supplementary Table S4 for the validation
set  (see  http://ajp.amjpathol.org). 

Integrated DASL biomarker analysis

After fitting a univariate Cox proportional hazard (PH) model for each individual probe using
the training data, a set of 27 important probes was preselected based on an FDR threshold of 0.30 (see
Supplementary Table S5 at http://ajp.amjpathol.org). Next, to identify the optimal prediction score
based on the preselected probes, we fit a lasso Cox proportional hazard (PH) model 17,18 first using the
set of 25 preselected mRNA probes only, resulting in a panel of nine protein-coding genes shown in
Table 2 (RAD23B, FBP1, TNFRSF1A, NOTCH3, ETV1, BID, SIM2, ANXA1, and BCL2). A final
prediction model was then built to include the predictive score based on this panel of nine mRNA
biomarkers as well as the relevant clinical biomarkers including T-stage, PSA and Gleason score,
which could be used to predict recurrence following radical prostatectomy. Kaplan-Meier analysis
(Figure 1A) demonstrated that these probes could significantly discriminate patients at higher and
lower risk of recurrence by the log-rank test (p < 0.001). We next applied the final predictive model
developed on the training set to the validation set, a separate, independent DASL profiling experiment
performed on a different day. Kaplan-Meier analysis (Figure 1B) on this validation set determined that
the model could discriminate patients at higher and lower risk of recurrence (p = 0.010).

Subsequently, we repeated the above training procedure using the complete set of 27
preselected mRNA and miRNA probes, and we identified an optimal panel of ten mRNAs and two
microRNAs (Table 3) and built a final prediction model for prostate cancer biochemical recurrence,
which again included relevant clinical biomarkers. Kaplan-Meier analysis and the log-rank test
determined that this panel could also significantly discriminate patients at higher and lower risk of
recurrence both in the training set (p < 0.001, Figure 1C) and in the validation set (p = 0.013, Figure
1D).


Prediction  of  Cases  with  a  Gleason  Score  7 

Prediction of recurrence for patients with a Gleason score 7 is particularly difficult. In order to
address this issue, we applied the biomarker panels to the subset of cases in the training and validation
sets that had a Gleason score 7. The prediction model based on the nine-mRNA panel was significant
at discriminating biochemical recurrence in Gleason score 7 cases in both the training set (p < 0.001,
Figure 2A) and the validation set (p = 0.027, Figure 2B). For the prediction model based on the
combined panel of ten mRNAs and two miRNAs in Table 3, the predictive value was again significant
for both the training set (p < 0.001, Figure 2C) and the validation set (p = 0.010, Figure 2D). A
summary of the p-values for predicting biochemical recurrence is given in Table 4. In all cases, the
prediction models that use one of the two gene biomarker panels plus clinical information outperform
the prediction model using only clinical information.

Analysis  of  clinical  recurrence 

Although most patients who have clinical recurrence following prostatectomy also have
biochemical recurrence, there is a significant population of patients with biochemical recurrence who
do not have clinically significant recurrences observed during their follow-ups. To evaluate our
biomarker panel of biochemical recurrence for predicting clinical recurrence, we tested the
prediction  model  based  on  the  combined  mRNA/miRNA  panel  in  the  same  training  and  validation 
samples  using  their  clinical  recurrence  outcome  data.  Unfortunately,  clinical  recurrence  data  was 
lacking  on  some  of  the  samples,  and  the  total  number  of  samples  used  in  the  training  set  was  reduced. 
In  the  training  data,  the  combined  mRNA/miRNA  panel  was  highly  significant  for  predicting  clinical 
recurrence  in  all  patients  (p=0.002)  as  well  as  in  the  subset  of  patients  with  a  Gleason  score  7 
(p=0.004); in the validation data, it was also significant for predicting recurrence in patients with a
Gleason  score  7  (p=0.023)  and  trended  towards  significance  in  all  patients  (p=0.078).  A  summary  of 
the  p-values  for  predicting  clinical  recurrence  is  given  in  Table  5.  In  all  cases,  the  prediction  model 
that  uses  the  combined  mRNA  and  miRNA  panel  plus  the  clinical  information,  again,  outperforms  the 
prediction  model  that  uses  only  the  clinical  information. 

We also performed an analysis to construct a predictive set of biomarkers based on the clinical
recurrence data instead of biochemical recurrence. Only three probes passed the initial preselection
step for the univariate Cox PH modeling, all corresponding to the ETV1 gene, which is likely due to
the considerably smaller number of clinical recurrences in the training set as well as the smaller total
sample size. Furthermore, the prediction model built on this set of gene biomarkers did not perform as
well as the models built on biochemical recurrence (data not shown).

DISCUSSION


In the past few years, enormous progress has been made in developing technologies to exploit
FFPE tumor tissue samples for gene expression and proteomic analysis. The use of FFPE tissues as a
starting material is attractive because this approach should make biomarkers identified in this way
much easier to translate into widespread clinical practice. DASL profiling makes it possible to define
gene sets using FFPE prostate cancer tissues that could have potential prognostic and predictive value.
For example, the DASL assay has been used recently to identify a 16-gene set that correlates with
prostate cancer relapse 11. There was no overlap between our panel of ten mRNA and two miRNA
biomarkers described here and the previously described 16-gene panel, even though ten of the genes in
the 16-gene panel previously reported were included in our 522-gene custom prostate DASL panel. When
we analyzed the performance of the probes corresponding to those ten mRNAs in our dataset, we
found that they were not able to significantly discriminate patients at higher or lower risk of
recurrence. In this previous study, the gene signature selection and prediction model building were
performed in separate steps and the signature selection was based on the correlation between the gene
expression and Gleason score rather than between the gene expression and time to biochemical
recurrence; our analytic approach overcomes these limitations. Specifically, our approach of building
(training) prediction models takes advantage of recent advancement in regularized regression models
for survival outcomes 17,18; regularized regression models can achieve simultaneous feature selection
and model estimation and avoid model overfitting, leading to better prediction performance. Our use of
a pre-selection step is similar to the recently proposed sure independence screening methods 20,21,
which have been shown to achieve better performance in the presence of high-dimensional data for
survival analysis compared to regularized regression without a pre-selection step 22.

Two other recent studies have applied DASL profiling to prostate cancer but did not detect
any signature that improved upon clinical models in validation sets 23,24. While these studies used large
cohorts  with  long-term  follow-up,  they  examined  different  panels  of  mRNA  transcripts  and  did  not 
include  probes  corresponding  to  miRNA  genes.  Moreover,  these  earlier  studies  suggested  that  tumor 
heterogeneity  may  play  an  important  role  in  confounding  signature  identification.  For  our  study  of 
prostatectomy  specimens,  we  identified  the  most  prominent  tumor  lesion,  and  used  a  tissue  core 
sample  from  that  region  to  minimize  stromal  contributions  and  tumor  heterogeneity. 

In our twelve-gene predictive biomarker panel, nine of the genes are positively associated with
recurrence, and three are negatively associated with recurrence. The nine genes positively associated
with recurrence included miR-519d, Notch homolog 3 (Notch3), Fructose-1,6-bisphosphatase 1
(FBP1), ETS variant gene 1 (ETV1), BH3 interacting domain death agonist (BID), Single-Minded
homolog 2 (SIM2), RAD23 homolog B (RAD23B), LETM1 domain containing 1 (LETMD1), and
Cyclin G2 (CCNG2). Little is known about miR-519d other than that it may be associated with obesity 25.
NOTCH3 is one of four Notch family receptors in humans, and Notch signaling has been shown to be
important for prostate cancer cell growth, migration, and invasion 26,27 as well as normal prostate
development 28,29. FBP1 is expressed in the prostate and is involved in gluconeogenesis 30. The
identification of this metabolic enzyme as a biomarker of recurrence is initially surprising, but given
the recent identification of isocitrate dehydrogenase 1 (IDH1) mutations in glioblastoma 31, and the fact
that FBP1 was overexpressed in independent microarray analyses of prostate cancers 7,32, the potential
of FBP1 as a biomarker should not be underestimated. ETV1 is well established as a partner in one of the
commonly recurrent translocations found in prostate cancers 9,15, and has been used in clinical models
of recurrence following prostatectomy 33. BID is a pro-apoptotic protein that binds to BCL2 and
potentiates apoptotic responses upon cleavage in response to tumor necrosis factor alpha (TNFα) and
other death receptors 34,35. SIM2 was identified as a potential biomarker of prostate cancer in 2002 36
and later independently confirmed by Halvorsen et al 37 and Arredouani et al 38. SIM2 functions as a
transcription factor that represses the proapoptotic gene BNIP3 39. RAD23B plays a critical role in
DNA damage recognition and nucleotide excision repair 40, as well as inhibiting MDM2-mediated
degradation of the p53 tumor suppressor 41. LETMD1 (also known as HCCR) is an oncogene that is
induced by Wnt 42 and PI3K/AKT signaling 43, inhibits p53 function 44, and is a biomarker for
hepatocellular 45 and breast 46 cancers. Cyclin G2 is an atypical cyclin that is induced by DNA
damage 47 in a p53-independent manner, as well as by PI3K/AKT/FOXO signals 48, and induces
p53-dependent cell cycle arrest 49.

The three genes in the predictive biomarker panel negatively associated with recurrence were
miR-647, the TNFα receptor (TNFRSF1A), and annexin A1 (ANXA1). While little is known about
miR-647, TNFRSF1A (also known as TNFR1) mediates pro-apoptotic responses to TNFα ligand 50,51.
Annexin A1 expression is reduced in early onset prostate cancer 52 and high-grade prostatic
intraepithelial neoplasia 53. ANXA1 plays important roles in vesicle trafficking, and reduced ANXA1
promotes EMT and metastasis 54 and upregulates autocrine IL-6 signaling 55. Thus, as a whole, this
panel of biomarkers appears to reflect changes in DNA stability, PI3K signaling, p53 activity,
apoptosis, and differentiation consistent with more aggressive disease.

Although  this  study  goes  beyond  a  pilot  study,  enhanced  by  selection  of  samples  from  multiple 
institutions,  the  number  of  specimens  tested  is  still  relatively  small.  Re-analysis  of  our  data  using  only 
the  Emory  samples  for  the  training  set  did  not  identify  any  significant  probes,  likely  due  to  the 
substantially  smaller  sample  size.  Thus,  while  the  performance  of  our  panel  of  biomarkers  is 
significant,  even  for  Gleason  score  7  patients,  future  studies  beyond  the  scope  of  this  work  will  be 
necessary  to  perform  independent  validation  on  much  larger  sample  sets  with  greater  statistical  power. 
Moreover, it is now feasible to perform DASL assays on virtually the entire genome, in an assay that
queries 24,526 transcripts derived from the RefSeq database. Future studies will test combined mRNA
and miRNA biomarker panels, and query the entire genome to determine if other biomarker panels can
achieve even greater success in prediction of biochemical and clinical recurrence of prostate cancer.
Planned larger scale validation studies will determine whether these biomarkers are predictive for
Gleason score 7 cases, and their utility at predicting clinical as well as biochemical recurrence.

TABLES


                              Training      Training      Training      Validation    Validation    Validation
                              Set (Total)   Set (No BCR)  Set (BCR)     Set (Total)   Set (No BCR)  Set (BCR)
Number Cases                  70            41            29            40            27            13
Clinical Recurrence           8             0             8             11            0             11
No Clinical Recurrence        57            41            16            29            27            2
Median Time F/U (months)      84            83            81            74            75            73
Median Time to BCR (months)   19            N/A           19            14            N/A           14
Median Time no BCR (months)   48            83            19            34.5          56            14
Gleason Score (Avg +/- SD)    6.9 +/- 0.6   6.7 +/- 0.6   7.0 +/- 0.6   7.0 +/- 0.8   6.8 +/- 0.7   7.4 +/- 1
PSA (Avg +/- SD)              9.2 +/- 5.4   8.7 +/- 6.4   9.9 +/- 3.8   12.7 +/- 8.4  12.4 +/- 9.9  13.1 +/- 5.3
Age (Avg +/- SD)              61.9 +/- 7.7  61.2 +/- 7.7  62.9 +/- 7.8  63.6 +/- 8.4  63.5 +/- 8.3  64 +/- 8.9

Table 1: A summary of the clinical characteristics of the training and validation sets of patient
samples. (BCR = Biochemical Recurrence, F/U = follow up, PSA = prostate specific antigen, SD =
standard deviation).


Symbol        Description                                             Coefficient  References
RAD23B        RAD23 homolog B                                         0.152        40,41
FBP1          Fructose-1,6-bisphosphatase 1                           0.310        7,30,32
TNFRSF1A      Tumor Necrosis Factor Receptor Superfamily, Member 1A   -0.560       50,51
NOTCH3        Notch homolog 3                                         0.426        26,27
ETV1          Ets Variant Gene 1 (ETV1)                               0.157        9,15
BID           BH3 Interacting Domain Death Agonist (BID)              0.248        34,35
SIM2          Single-Minded Homolog 2                                 0.043        36-38,56
ANXA1         Annexin A1                                              -0.185       52-55
BCL2          B-cell CLL/lymphoma 2                                   0.028        57,58

Table 2: Nine-gene predictor of prostate cancer recurrence following surgery. Coefficient is derived
from the lasso Cox proportional hazards model and was used for computing the predictive score.
Positive coefficients indicate a positive association with recurrence, and negative coefficients a
negative association with recurrence.


Symbol        Description                                             Coefficient  References
RAD23B        RAD23 homolog B                                         0.070        40,41
FBP1          Fructose-1,6-bisphosphatase 1                           0.251        7,30,32
TNFRSF1A      Tumor necrosis factor receptor superfamily, member 1A   -0.588       50,51
CCNG2         Cyclin G2                                               0.008        47-49
hsa-miR-647   hsa-miR-647                                             -0.318
LETMD1        LETM1 domain containing 1                               0.063        42-44,46
NOTCH3        Notch homolog 3                                         0.367        26,27
ETV1          ETS variant gene 1 (ETV1)                               0.179        9,15
hsa-miR-519d  hsa-miR-519d                                            0.551        25
BID           BH3 interacting domain death agonist (BID)              0.128        34,35
SIM2          Single-minded homolog 2                                 0.124        36-38,56
ANXA1         Annexin A1                                              -0.143       52-55

Table 3: Twelve-gene predictor of prostate cancer recurrence following surgery using ten mRNAs and
two microRNAs. Coefficient is derived from the lasso Cox proportional hazards model and was used for
computing the predictive score. Positive coefficients indicate a positive association with recurrence,
and negative coefficients a negative association with recurrence.


Training Set              mRNA panel    Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=61)          <0.001        <0.001                       0.096
Gleason score 7 (n=42)    <0.001        <0.001                       0.641

Validation Set            mRNA panel    Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=35)          0.010         0.013                        0.020
Gleason score 7 (n=19)    0.027         0.010                        0.028

Table 4: Summary of p-values (log-rank test) of prediction of biochemical recurrence on training and
validation sets for the entire dataset and the subset of Gleason score 7 cases using two biomarker
panels, all of which are adjusted for T-stage, PSA, and Gleason score, or using clinical information
only. Significant p-values are indicated in bold.


Training Set              Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=56)          0.002                        0.262
Gleason score 7 (n=37)    0.004                        0.136

Validation Set            Combined mRNA/miRNA panel    Clinical Information Only
All Cases (n=35)          0.078                        0.193
Gleason score 7 (n=19)    0.023                        0.080

Table 5: Summary of p-values (log-rank test) of prediction of clinical recurrence on training and
validation sets for the entire dataset and the subset of Gleason score 7 cases using the combined
mRNA/miRNA panel, all of which are adjusted for T-stage, PSA, and Gleason score, or using clinical
information only. Significant p-values are indicated in bold.




FIGURE  LEGENDS 


Figure  1:  Prediction  of  biochemical  recurrence  in  all  prostate  cancer  patients  using  two 
biomarker  panels,  adjusted  for  clinical  information.  (A)  Kaplan-Meier  analysis  of  the  training  set 
patients  that  were  separated  based  on  the  mRNA  panel  described  in  Table  2.  (B)  Kaplan-Meier 
analysis  on  the  validation  cases  using  the  mRNA  panel.  (C)  Kaplan-Meier  analysis  of  the  training  set 
using  the  combined  mRNA  and  miRNA  panel  described  in  Table  3.  (D)  Kaplan-Meier  analysis  of  the 
validation  set  using  the  combined  mRNA  and  miRNA  panel. 

Figure  2:  Prediction  of  biochemical  recurrence  in  prostate  cancer  patients  with  a  Gleason  score  7 
using  two  biomarker  panels,  adjusted  for  clinical  information.  (A)  Kaplan-Meier  analysis  of  the 
training  set  of  Gleason  score  7  cases  using  the  mRNA  panel  described  in  Table  2.  (B)  Kaplan-Meier 
analysis of the Gleason score 7 cases in the validation set using the mRNA panel. (C) Kaplan-Meier
analysis of the Gleason score 7 cases in the training set using the combined mRNA and miRNA panel
described in Table 3. (D) Kaplan-Meier analysis of the Gleason score 7 cases in the validation set
using the combined mRNA and miRNA panel.

[Figure panels not reproduced. Recoverable panel titles: Training data set (n = 61, p < 0.001);
Validation data set (n = 35, p = 0.010); Validation data set (n = 19, p = 0.027).]